Genomic cloning and mapping.

A method for cloning genomic DNA comprises the steps of:

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end sequences;

(b) contacting the digested genomic DNA with a linker, said linker comprising a reporter nucleotide sequence 
and a cohesive-end sequence complementary to a cohesive-end sequence generated by the restriction 
enzyme of step (a) under conditions such that a linker is joined to a restriction fragment of genomic DNA; and 

(c) amplifying the product of step (b). 

Likewise, a method for mapping genomic DNA restriction fragments comprises the steps of: 

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end sequences;

(b) amplifying the restriction fragment products of step (a); 

(c) determining the cohesive-end sequence of each of said restriction fragments; and in accordance therewith, 

(d) aligning the restriction fragments in a physical map. 
This application is a divisional application of our European application no. 90904029.7, herein referred
to as the parent application. 

The field of this invention is the in vitro amplification of nucleic acids. In particular, this invention relates 
to novel methods for cloning and replicating nucleic acids in the course of sequencing and physical 
mapping. It also relates to efficient methods for physical mapping of highly complex and lengthy genomes.

The ability to map and sequence genomic DNA has important applications in the biological and medical 
sciences. The localization of human genes responsible for rare diseases is an important starting point for 
gene isolation and cloning, and the identification of relevant gene products and mutations, leading ultimately 
to improved understanding of the molecular basis of disease. By identifying genes or regions of human 
chromosomal DNA involved in hereditary forms of cancer, Alzheimer's disease, and other diseases, new
methods of diagnosis and treatment may be developed. 

Mapping a genome refers to pinpointing the location of genes and other features of interest on 
particular chromosomes. Sequencing refers to determining the order of nucleotides on the chromosomes. 
The two main types of genome maps are genetic linkage maps and physical maps. Genetic linkage maps 
are generally made by studying the frequency with which two different traits are inherited together, or
linked. Physical maps are derived mainly from chemical measurements made on the DNA molecules that 
comprise the genome. Accordingly, physical maps can be of several different types and include restriction 
maps as well as lower resolution maps obtained by in situ hybridization. All these maps share the common 
goal of placing information about genes in a systematic linear order according to their relative positions 
along a chromosome.

Restriction maps are based on sites in DNA that are cut by special proteins called restriction enzymes. 
Each enzyme recognizes a specific short sequence of nucleotides termed a "recognition sequence" and 
cuts each strand of the DNA at some point, termed a "cleavage site" or "restriction site," that is within the 
recognition sequence or some distance away from it. Since many different nucleotide sequences are 
recognized by one or another restriction enzyme, and those sequences are generally dispersed randomly
throughout a genome, a physical map may be constructed by determining the relative locations of different 
restriction sites precisely. 

The genome of the bacterium E. coli comprises about 4.7 million base pairs. The smallest human 
chromosome is ten times that size, with the complete (haploid) human genome comprising about 3 billion
base pairs. Because of the large size of most genomes, the restriction enzymes that are of particular value
in restriction mapping are those having recognition sites that occur relatively infrequently, or are otherwise
distantly spaced throughout the genome, such that the genome may be cut into relatively large DNA
fragments, preferably 100,000 to 2 million or more base pairs long. The fragments of DNA that are
produced upon digestion of a DNA substrate with a restriction enzyme are termed "restriction fragments." 

In general, the fewer the number of genomic restriction fragments generated, the easier is the task of
correctly ordering them in a physical map. 

The largest genome that has been mapped with restriction enzymes that cleave DNA infrequently is the
single chromosome of E. coli. Smith, et al., Science 236:1448-1453 (1987). The approach taken in that
instance was to digest intact E. coli chromosomal DNA with the restriction enzyme Not I, which cuts within
an eight nucleotide recognition sequence, thereby producing a range of DNA fragments from 15,000 base
pairs to 1 million base pairs. Most of the information needed to order the DNA restriction fragments came 
from hybridization studies using labelled probes corresponding to E. coli genes that had previously been 
cloned and characterized. 

Thus, the first step toward constructing a physical map of a genome generally involves the isolation and 
cloning of discrete genomic DNA fragments. In theory, sensitive DNA-probe technologies make it possible
to construct physical maps while cloning only a small fraction of the genome that is being mapped. In 
practice, however, such an approach is suitable only for the coarsest level of physical mapping. At higher 
resolutions, most physical mapping is likely to be carried out by analyzing a collection of overlapping DNA 
clones that cover the entire genome. Physical mapping would entail the ordering of the individual DNA 
clones according to their positions in the original genome. The individual clones are especially useful
because they provide an inexhaustible source of the DNA from each genomic region. Having available a
collection of genomic DNA clones is also a prerequisite to determining the nucleotide sequence of the
genome, since the clones would provide the actual DNA fragments that would be purified and prepared for 
sequencing. 

The principal limitation on the construction of a high-resolution physical map of a genome from an
ordered collection of DNA clones is the failure of present methods to provide the means for cloning very 
large DNA fragments. Using standard recombinant DNA techniques, for example, the size of the largest 
DNA fragment that may be clonally propagated in a bacterial host cell is about 50,000 base pairs or less.
That result is typically achieved by splicing the DNA fragment into a cosmid, a modified bacteriophage
lambda vector specifically designed to accommodate DNA inserts on the order of 40,000 to 50,000 base
pairs in length.

The shortcomings of this cosmid cloning system in constructing a physical map of a genome are two-fold.
First, very large numbers of cosmid clones must be constructed to generate a complete set of
mutually overlapping DNA fragments for the entire genome. In the case of the haploid human genome, for
example, it has been estimated that anywhere from 75,000 to 375,000 or more different cosmid clones
would be required for a high-resolution physical map. Pines, 1987, "Mapping the Human (Howard Hughes
Medical Institute, Bethesda, MD). Second, cosmid clones are known to frequently accumulate deletions,
which may result from selection for a shorter size for faster replication and/or metabolic imbalances caused
by the increased dosage of a particular gene on the cosmid that affects the growth of the bacterial host
cells. Therefore, cosmid clones, especially those containing essential genes, are usually difficult to maintain.
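
As a rough illustration of the scale involved (and not the derivation used in the cited estimate), the number of clones can be approximated by dividing the genome size by the insert size and multiplying by a redundancy factor. The function name `clones_needed` and the five-fold redundancy used below are assumptions chosen only because they reproduce figures of the same order as those quoted above.

```python
import math


def clones_needed(genome_bp: int, insert_bp: int, coverage: int = 1) -> int:
    """Approximate clone count for a genome of genome_bp base pairs.

    coverage is a hypothetical redundancy factor: 1 means the inserts
    are laid end to end with no overlap at all.
    """
    return math.ceil(genome_bp * coverage / insert_bp)


# Haploid human genome (about 3 billion bp) in 40,000 bp cosmid inserts.
single_pass = clones_needed(3_000_000_000, 40_000)
five_fold = clones_needed(3_000_000_000, 40_000, coverage=5)
print(single_pass, five_fold)
```

Because overlapping clones are needed to order the fragments, a real library would require some redundancy above the single-pass figure.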

Recently, a method has been described for cloning large fragments of exogenous DNA into yeast by
means of artificial chromosome vectors. Burke, et al., 1987, Science 236:806-812. The use of this "yeast
artificial chromosome" (YAC) cloning system offers a significant advantage over the cosmid cloning system
in that genomic DNA fragments that range in size up to several hundred thousand base pairs may be
successfully propagated. There are, however, several limitations on the use of the YAC cloning system,
particularly as it may be applied to genome mapping. For example, both the number of cloned molecules
per yeast cell and the number of yeast cells per colony are much lower in YAC cloning than in cosmid
cloning in bacteria, thus limiting the quantities of cloned DNA available for analysis. Furthermore, as with
any in vivo cloning method, the DNA fragments cloned in YAC vectors can undergo rearrangements or
deletions in the host cell that may make it difficult to obtain a set of clones representing the entire genome.

It is one object of the present invention, therefore, to provide a method for efficiently cloning large
fragments of genomic DNA that avoids the problems inherent in the existing in vivo cloning technology. It is
a further object of the present invention to provide a method for amplifying a desired nucleic acid sequence,
such as that represented in a cloned genomic DNA fragment, thereby producing sufficient quantities of the
sequence for physical or chemical analysis. The ability to carry out this method in vitro makes it generally
useful for amplifying nucleic acids from any source, without the problems of selectivity, rearrangements,
and deletions that may result from propagating an exogenous nucleic acid sequence in a host cell. At the
same time, the ability to efficiently amplify nucleic acid sequences ranging in size up to several hundred
thousand base pairs or more makes the method ideally suited for such applications as constructing a high-resolution
physical map of a complex genome or isolating intact medically or biologically important genes
or gene clusters.

Figures 1a-1b (referred to hereafter as Figure 1) show 64 oligonucleotide linkers having cohesive-end
sequences complementary to each of the possible restriction fragment cohesive ends resulting from
digestion of a genomic DNA substrate with the restriction endonuclease Sfi I.

The present invention is directed to improved methods for cloning genomic DNA and for mapping
genomic DNA restriction fragments. These methods may be used in conjunction with or independently of
the method of DNA amplification described in the parent application, which entails the synthesis of
novel replicable DNAs comprising a bacteriophage phi29 replication origin and genomic DNA heterologous
to phi29.

The method for cloning genomic DNA comprises the steps of: 

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end 
sequences; 

(b) contacting the digested genomic DNA with a linker, said linker comprising a reporter nucleotide
sequence and a cohesive-end sequence complementary to a cohesive-end sequence generated by the 
restriction enzyme of step (a), under conditions such that a linker is joined to a restriction fragment of 
genomic DNA; and 
(c) amplifying the product of step (b). 

The method for mapping genomic DNA restriction fragments comprises the steps of:

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end 
sequences; 

(b) amplifying the restriction fragment products of step (a); 

(c) determining the cohesive-end sequence of each of said restriction fragments; and in accordance
therewith,

(d) aligning the restriction fragments in a physical map. 
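
The alignment in step (d) can be pictured with a minimal sketch, which is not the procedure of the specification but a simplified illustration of the idea. It assumes a linear substrate, that each fragment's orientation is already known, that every junction carries a unique cohesive-end sequence, and that ends are written 5' to 3'. Two fragments are treated as neighbours when the right-hand end of one is the reverse complement of the left-hand end of the other. The names `order_fragments` and `reverse_complement` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def order_fragments(fragments: dict) -> list:
    """Order fragments of a linear DNA by matching complementary ends.

    fragments maps a fragment name to (left_end, right_end); None marks
    an original end of the linear substrate (no cohesive end).
    """
    by_left = {}
    for name, (left, _right) in fragments.items():
        if left is None:
            continue
        if left in by_left:
            raise ValueError(f"cohesive end {left} is not unique")
        by_left[left] = name

    starts = [name for name, (left, _right) in fragments.items() if left is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment with a free left end")

    order = [starts[0]]
    while True:
        right = fragments[order[-1]][1]
        if right is None:  # reached the other original end
            return order
        neighbour = by_left.get(reverse_complement(right))
        if neighbour is None or neighbour in order:
            raise ValueError(f"no unambiguous neighbour for {order[-1]}")
        order.append(neighbour)


# Hypothetical fragments and ends, for illustration only.
example = {
    "B": ("GCA", "AAT"),
    "A": (None, "TGC"),
    "C": ("ATT", None),
}
print(order_fragments(example))
```

When two junctions share the same cohesive-end sequence the sketch raises an error rather than guessing; resolving such cases would need additional information.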

The method for amplification of a nucleic acid sequence, described and claimed in the parent application, and which
may be used with the methods of the present application, comprises the steps of:






EP 0 593 095 A1 



(a) synthesizing a double-stranded nucleic acid comprising a bacteriophage phi29 replication origin and a nucleotide
sequence which is heterologous to phi29; and

(b) synthesizing DNA from said double-stranded nucleic acid under the control of said replication origin.

The advantage obtained by utilizing the methods of the present invention for cloning genomic DNA
and mapping genomic DNA restriction fragments, as compared to methods already known in the art, is that
the present invention allows the cloning and mapping of genomic DNA of virtually any length, without resort
to the burdensome and error-prone steps of inserting the genomic DNA into specialized vectors and
replicating it in vivo, and without the need for prior knowledge of the nucleotide sequence or genetic
organization of any part of the genomic DNA.

The advantage obtained by the method for amplification of a heterologous DNA under the control of a
bacteriophage phi29 replication origin is that the in vitro synthesis of DNA products, in the presence of
phi29 DNA polymerase and deoxyribonucleoside triphosphates, occurs continuously, using as a template
DNA up to 100,000 base pairs or more in length. The method of the present invention therefore avoids the
multiple exacting primer hybridization reactions required by the polymerase chain reaction (PCR) method of
Mullis, et al., U.S. Pat. No. 4,683,195, that is commonly used in the art, as well as the inherent limitations of
the PCR method as applied to the amplification of lengthy DNA substrates.

The term "genomic DNA" as used herein refers to any DNA comprising a sequence that is normally
present in the genome of a prokaryotic or eukaryotic cell or a virus. The genomic DNA of a eukaryotic cell
includes, for example, nuclear and extranuclear chromosomal DNA, such as that present in mitochondria
and chloroplasts. Also included within the scope of the term "genomic DNA" is any cDNA prepared from a
messenger RNA or from the RNA genome of a virus. Methods for the extraction and/or purification of
genomic DNA have been described, for example, by Gross-Bellard et al., 1978, Eur. J. Biochem. 36:32-38,
Smith et al., 1987, Meth. Enzymol. 151:461-489, and Moon, et al., 1987, Nuc. Acids Res. 15:611-630.
Methods for the preparation of cDNA are well known in the art. See, for example, Maniatis, et al., Molecular
Cloning: A Laboratory Manual (New York, Cold Spring Harbor Laboratory, 1982).

The term "heterologous" as used herein in reference to a nucleic acid sequence refers to a nucleic acid 
sequence not ordinarily replicated under the control of a bacteriophage phi29 replication origin. Typically 
the heterologous nucleic acid sequence will comprise a DNA sequence present in the nuclear or 
extranuclear genome of a prokaryotic or eukaryotic organism, or the genome of a virus other than
bacteriophage phi29. The heterologous nucleic acid sequence may be prepared by any suitable method,
such as recovery from a naturally occurring nucleic acid by use of a restriction enzyme, or in vitro 
synthesis, including, for example, the synthesis of a complementary DNA (cDNA) from an RNA. 

The term "reporter sequence" as used herein refers to any nucleic acid sequence by which it is
possible to identify, either directly or indirectly, another nucleic acid molecule with which the reporter
sequence is associated. The reporter sequence may be of defined sequence or defined function or both.
For example, if the reporter sequence comprises a defined nucleotide sequence, the nucleic acid molecule
with which it is associated may be identified by a hybridization method, wherein the reporter sequence
hybridizes with a complementary nucleic acid that is immobilized on a solid support. The hybridization
method may be carried out by any suitable method, including those described by Choutelie, et al., 1978,
Gene 3:113-122, and European Pat. App. 0 221 308 (Carrico). Alternatively, the reporter sequence may be
detected by its function, with or without knowledge of its sequence. Examples include the gene for a
detectable phenotype, or preferably a replication origin or a promoter. If the reporter sequence comprises a
replication origin or a promoter, for example, the nucleic acid molecule with which the reporter sequence is
associated is readily identified on the basis of its replication or transcription under the control of the reporter
sequence.

The term "cohesive-end" as used herein in reference to linkers and restriction fragments refers to a
single-strand extension at the end of a double-stranded nucleic acid molecule that extends beyond the
region of base-pairing between the complementary strands. Such single-strand extensions are termed
cohesive ends because they are capable of hybridizing with one another through base-pairing of
complementary nucleotides, thereby facilitating the intramolecular or intermolecular joining of nucleic acids.
The nucleotide sequence of the single-strand extension is termed the "cohesive-end sequence." Restriction
fragments having cohesive ends are preferably obtained by digestion of double-stranded DNA with a
suitable restriction enzyme, such as any of those hereinafter disclosed. Linkers having cohesive ends may
be obtained by the same method, or may be produced synthetically, such as by the annealing of
single-stranded oligonucleotides.

The term "oligonucleotide" as used herein in reference to linkers and primers is defined as a nucleic
acid molecule comprised of two or more deoxyribonucleotides or ribonucleotides. A desired oligonucleotide
may be prepared by any suitable method, such as purification from a naturally occurring nucleic acid or de
novo synthesis. Several methods have been described in the literature, for example, for the synthesis of
oligonucleotides of defined sequence using various techniques in organic chemistry. Narang, et al., 1979,
Meth. Enzymol. 68:90-109; Caruthers et al., 1985, Meth. Enzymol. 154:287; Froehler, et al., 1986, Nuc.
Acids Res. 14:5399-5407. Oligonucleotides prepared by any method may subsequently be joined together
by ligation or otherwise to form a single oligonucleotide of any required length and sequence.

The term "linker" as used herein refers to an oligonucleotide, whether occurring naturally or produced 
synthetically, which comprises double-stranded DNA. 

The term "primer" as used herein refers to an oligonucleotide, whether occurring naturally or produced
synthetically, which is substantially complementary or homologous to all or part of a nucleic acid sequence
to be amplified. The primer must be sufficiently long to hybridize with a template nucleic acid comprising
the sequence to be amplified, and to prime the synthesis of an extension product in the presence of an
agent for polymerization. Typically the primer will contain 15-30 or more nucleotides, although it may
contain fewer nucleotides. It is not necessary, however, that the primer reflect the exact sequence of the
nucleic acid sequence to be amplified or its complement. For example, non-complementary bases can be
interspersed into the primer, or complementary bases deleted from the primer, provided that the primer is
capable of hybridizing with the nucleic acid sequence to be amplified or its complement, under the
conditions chosen.

The term "origin-primer" as used herein refers to an oligonucleotide, whether occurring naturally or
produced synthetically, which comprises a replication origin, as defined further herein, joined to the 5' end of
a primer. Under suitable conditions and in the presence of an agent for polymerization, the origin-primer is
capable of acting as a point of initiation of an origin-primer extension product that includes the nucleic acid
sequence to be amplified or its complementary sequence, and the replication origin of the origin-primer.

The term "secondary primer" as used herein refers to an oligonucleotide, whether occurring naturally or
produced synthetically, which under suitable conditions and in the presence of an agent for polymerization,
is capable of acting as a point of initiation for a secondary primer extension product that includes the
nucleic acid sequence to be amplified or its complementary sequence.

The term "extension product" as used herein refers to a nucleic acid molecule, the synthesis of which
is initiated at the 3'-OH terminus of a primer, using as a template for synthesis the nucleic acid molecule to
which the primer is hybridized.

The term "agent for polymerization" as used herein is generally understood to refer to any enzyme that
catalyzes the synthesis of a nucleic acid molecule from deoxyribonucleotides or ribonucleotides, using an
existing nucleic acid as a template. Examples of such enzymes include DNA polymerase, RNA polymerase,
and reverse transcriptase.

In one of its aspects, the present invention is a method for cloning genomic DNA which has as its initial
step digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end
sequences. Such enzymes are well known in the art and include, for example, the restriction enzymes Bgl I,
BstX I, Hga I, and Sfi I.

The predictable cleavage of double-stranded DNA by a restriction enzyme results from the enzyme's 
recognition of a certain sequence of base pairs (recognition sequence) in the DNA substrate. For any 
individual restriction enzyme, this specificity is characteristic, and essentially invariant among DNAs. The
recognition sequences for most restriction enzymes consist of a specific sequence of four to eight base 
pairs or more, and may additionally include one or more random base pairs. 

In addition to the recognition sequence, a restriction enzyme is also characterized by the sites at which 
it introduces breaks in the phosphodiester bonds of each strand of the DNA substrate (cleavage sites) to 
generate discrete restriction fragments. The cleavage site for a particular restriction enzyme may occur
within the enzyme's recognition sequence or some distance away from it. In the case of a restriction 
enzyme that is capable of generating multiple different cohesive-end sequences, the sequence immediately 
surrounding the cleavage sites necessarily includes one or more random base pairs. 

By way of example, the recognition sequence for the restriction enzyme Sfi I comprises a specific
sequence of four base pairs separated by five random base pairs from a specific sequence of four base
pairs as shown:

                            ↓
    5' . . . G G C C N N N N N G G C C . . . 3'
    3' . . . C C G G N N N N N C C G G . . . 5'
                      ↑

wherein N may be any one of the four nucleotides A (adenine), T (thymine), G (guanine), or C (cytosine).
Sfi I introduces staggered cleavages within the recognition sequence at the sites indicated by the vertical
arrows. Accordingly, each resulting Sfi I restriction fragment terminates at one or both ends with a cohesive
end having the structure

    5' . . . GGCCNNNN 3'

    3' . . . CCGGN 5'

wherein the cohesive-end sequence, 5'-NNN-3', comprises any one of 64 possible trinucleotide sequences. 
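
The cleavage pattern just described can be sketched in a few lines of Python. This is an illustrative model only: it considers the top strand of a linear DNA, treats the cut as falling after the eighth base of each recognition sequence, and reports the three-base extension on the top strand of each cut. The names `SFI_SITE` and `sfi_digest` are hypothetical.

```python
import re

# GGCCNNNNNGGCC; the lookahead also finds sites that overlap one another.
SFI_SITE = re.compile(r"(?=GGCC[ACGT]{5}GGCC)")


def sfi_digest(seq: str):
    """Cut the top strand of a linear DNA at every Sfi I site.

    Returns (fragments, overhangs): the top-strand fragments in order,
    and for each cut the 3-base single-strand extension (bases 6 to 8
    of the recognition sequence), written 5' to 3'.
    """
    seq = seq.upper()
    fragments, overhangs = [], []
    previous_cut = 0
    for match in SFI_SITE.finditer(seq):
        site_start = match.start()
        cut = site_start + 8  # top strand: GGCCNNNN^NGGCC
        overhangs.append(seq[site_start + 5:cut])
        fragments.append(seq[previous_cut:cut])
        previous_cut = cut
    fragments.append(seq[previous_cut:])
    return fragments, overhangs


# Hypothetical 21-base substrate containing a single site.
fragments, overhangs = sfi_digest("AAAAGGCCATGCAGGCCTTTT")
print(fragments, overhangs)
```

For a linear substrate with k sites the sketch returns k + 1 fragments, of which the first and last each carry one original end of the substrate.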

In general, if the substrate DNA is a linear molecule, the restriction fragments resulting from digestion of
the substrate DNA with a restriction enzyme that generates multiple different cohesive-end sequences will
be of two types: restriction fragments that comprise one of the original ends of the linear substrate DNA and
one restriction enzyme-generated cohesive end, and so-called "internal" restriction fragments that comprise
two restriction enzyme-generated cohesive ends. If the substrate DNA is circular, all the restriction
fragments will comprise two restriction enzyme-generated cohesive ends.

As will be apparent from this description, both the number of different restriction fragments that are 
produced upon complete digestion of the substrate DNA with a restriction enzyme and the number of 
different cohesive-end sequences that are possible for those restriction fragments will depend upon the 
restriction enzyme that is used. 

The number of different restriction fragments that are produced upon complete digestion of the
substrate DNA with a restriction enzyme in turn depends upon the number of occurrences of the restriction
enzyme recognition sequence in the substrate DNA. The number of occurrences of a specific recognition
sequence may be estimated from the length of the restriction enzyme recognition sequence and the size of
the substrate DNA. Given that DNA is comprised of four different nucleotides, the statistical frequency of a
specific recognition sequence of length n base pairs is given by the formula:



$$\text{frequency} = \left(\frac{1}{4}\right)^{n}$$


For example, a recognition sequence of four base pairs is expected to occur with a frequency of 1/256, a
recognition sequence of five base pairs is expected to occur with a frequency of 1/1024, a recognition
sequence of six base pairs is expected to occur with a frequency of 1/4096, and so on. The expected
number of occurrences of a specific recognition sequence in the substrate DNA is then obtained by
multiplying the calculated recognition sequence frequency by the size of the substrate DNA in base pairs.
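
The estimate described above is simple enough to write out directly. The sketch below assumes, as the text does, that the four nucleotides are equally likely and independent at every position; the function names are hypothetical, and the 4.7 million base pair figure is the E. coli genome size quoted earlier.

```python
def site_frequency(n: int) -> float:
    """Statistical frequency of one specific sequence of n base pairs."""
    return 0.25 ** n


def expected_sites(n: int, substrate_bp: int) -> float:
    """Expected occurrences of an n-base-pair site in substrate_bp of DNA."""
    return substrate_bp * site_frequency(n)


for n in (4, 5, 6, 8):
    print(n, 1 / site_frequency(n), expected_sites(n, 4_700_000))
```

Real genomes depart from this uniform model (for example through biased base composition), so the result is best read as an order-of-magnitude guide rather than a prediction.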

The number of different cohesive-end sequences that are possible among the restriction fragments
produced by a restriction enzyme capable of generating multiple different cohesive-end sequences is a
function of the number of allowed nucleotide variations within the cohesive-end sequence. Most commonly,
each of the random nucleotides within the cohesive-end sequence may be any one of the four nucleotides
adenine, thymine, guanine, or cytosine, in which case the expected number of different cohesive-end
sequences is equal to 4^n, wherein n is the number of random nucleotides within the cohesive-end
sequence.
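
As an illustration of this count, the following sketch enumerates every possible cohesive-end sequence for a given number of random nucleotides and pairs each with the complementary end a linker would need. It is a minimal model with hypothetical function names; all sequences are written 5' to 3'.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cohesive_end_sequences(n: int) -> list:
    """All 4**n cohesive-end sequences with n random nucleotides."""
    return ["".join(bases) for bases in product("ACGT", repeat=n)]


def complementary_linker_ends(n: int) -> dict:
    """Map each fragment cohesive end to the linker end that pairs with it."""
    return {end: reverse_complement(end) for end in cohesive_end_sequences(n)}


pairs = complementary_linker_ends(3)
print(len(pairs), pairs["TGC"])
```

With n = 3 this yields the 64 trinucleotide ends discussed for Sfi I. Because n is odd, no end equals its own reverse complement, so each end pairs with a different sequence.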

Any restriction enzyme capable of generating multiple different cohesive-end sequences may be used 
in the present invention, although the choice of a restriction enzyme in a particular instance will preferably
be made on the basis of the size and complexity of the substrate DNA.

In general, it is desirable that the recognition sequence of the restriction enzyme chosen occur
relatively infrequently in the substrate DNA, in order to reduce the number of DNA restriction fragments that
are produced and that are subject to subsequent manipulation and/or analysis. In mapping studies of
prokaryotic or eukaryotic chromosomal DNA, for example, a restriction enzyme having a recognition
sequence of at least six base pairs will usually be required, preferably at least eight base pairs, in order that
the resulting restriction fragments have an average size within the range of about 10,000 base pairs to
100,000 base pairs or more. Optionally, the number of restriction fragment products may be reduced (and
their average size correspondingly increased) by adjusting the conditions under which the substrate DNA is
contacted with the restriction enzyme (e.g., reaction time or temperature), such that the DNA substrate is
only partially digested, with one or more of the restriction enzyme cleavage sites remaining uncleaved.

At the same time, it is desirable that the restriction enzyme be capable of generating restriction
fragments of the substrate DNA having a sufficient number of different cohesive-end sequences. Generally,
the greater the number of cohesive-end sequences generated by the restriction enzyme, the greater the
likelihood of distinguishing the resulting restriction fragments from one another on the basis of their
cohesive-end sequences, and in turn, the easier it will be to clone the individual DNA restriction fragments
and/or to order them in a physical map. For the digestion of prokaryotic or eukaryotic chromosomal DNA,
for example, the restriction enzyme chosen will usually be one capable of generating at least 64 different
cohesive-end sequences, which corresponds to a cohesive-end sequence comprising a minimum of three
random nucleotides.

After digesting genomic DNA with a suitable restriction enzyme, the method for cloning genomic DNA
has as its next step contacting the resulting DNA restriction fragment(s) with an oligonucleotide linker
comprising a reporter sequence and a cohesive-end sequence complementary to one of the multiple
different cohesive-end sequences generated by the restriction enzyme.

The oligonucleotide linker and the DNA restriction fragment(s) are desirably incubated together under
conditions such that the cohesive end of the oligonucleotide linker hybridizes specifically with the
complementary cohesive end of a restriction fragment. The linker and a DNA restriction fragment with which it
has hybridized are then covalently joined using any suitable method, preferably by ligation with DNA ligase.
The presence of the reporter sequence provides the means for identifying the resulting product from
amongst a mixture of DNA restriction fragments, and preferably provides the means for isolating and/or
selectively amplifying the resulting product.

If digestion of the genomic DNA produces a mixture of different restriction fragments comprising
different cohesive-end sequences, each different restriction fragment of the mixture conveniently may be
cloned according to the present invention by contacting the mixture with a combination of two or more
linkers having cohesive-end sequences complementary to those of the different restriction fragments.

Because the cohesive-end sequences of the different restriction fragments of the mixture usually will
not be known with certainty, it is desirable under such circumstances that the combination of linkers
comprise a library of linkers having cohesive-end sequences complementary to substantially every possible
cohesive-end sequence generated by the particular restriction enzyme used. In this manner, it may be
assured that substantially every restriction fragment cohesive end is joined with a linker. Depending on the
number of different restriction fragments resulting from digestion of the genomic DNA, it may further be
desirable that individual aliquots of the mixture be separately contacted with each individual linker from a
library of linkers, preferably two or more linkers comprising different cohesive-end sequences, in separate
reaction vessels, as for example, the separate wells of a microtiter plate. In this manner, the number of
different restriction fragments which become joined with linkers in any one reaction may be reduced, in turn
reducing or eliminating the need in a subsequent step to separate each different restriction fragment, one
from another, in order to achieve the cloning of each individual genomic DNA restriction fragment.
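
The effect of splitting the digest into separate reactions can be pictured with a small model. The sketch below is illustrative only: it assumes one linker per well, perfect hybridization specificity, and cohesive ends written 5' to 3'; the function and fragment names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def fragments_joined_per_well(fragment_ends: dict, linker_ends: list) -> dict:
    """For each linker (one per well), list the fragments it can join.

    fragment_ends maps a fragment name to a tuple of its cohesive-end
    sequences; a linker joins a fragment when the linker's cohesive end
    is the reverse complement of one of the fragment's ends.
    """
    wells = {}
    for linker in linker_ends:
        target = reverse_complement(linker)
        wells[linker] = [
            name for name, ends in fragment_ends.items() if target in ends
        ]
    return wells


# Hypothetical digest with two fragments, tested against three linkers.
digest = {"F1": ("TGC", "AAT"), "F2": ("GCA", "CCC")}
print(fragments_joined_per_well(digest, ["GCA", "GGG", "TTT"]))
```

In this model each well receives only the fragments matching its linker, which is why fewer fragments need to be separated from one another after the reaction.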

The final step of the method for cloning genomic DNA is the amplification of a genomic DNA restriction
fragment to which a linker and reporter sequence has been joined, by any suitable means, either in vivo,
such as by replication in a host cell, or in vitro. In vitro amplification of a restriction fragment to which a
reporter sequence of defined sequence is joined may be accomplished, for example, by the methods
disclosed by Mullis, et al., supra, or Miller, et al., PCT Publ. No. WO89/06700, published 27 July 1989,
using a primer comprising a sequence complementary to that of the reporter sequence. However, neither of
these methods is particularly well-suited for the amplification of lengthy DNA fragments. For example, the
method of Mullis, et al. is generally limited to the amplification of DNA fragments no larger than several
thousand base pairs due to the low processivity and relatively high error rate of the DNA polymerase used
in the amplification reaction. Higuchi, et al., 1988, Nuc. Acids Res. 16:7351-7367; Saiki, et al., 1988, Science
239:487-491. Furthermore, the method of Mullis, et al. requires multiple cycles of nucleic acid denaturation
and primer hybridization, resulting in burdensome labor or automation costs.

In a particularly preferred embodiment of the present invention, the reporter sequence used in the
cloning of genomic DNA comprises a bacteriophage phi29 replication origin and amplification of the
restriction fragment to which it is joined is accomplished in vitro using bacteriophage phi29 DNA
polymerase.

The term "replication origin" generally refers to a sequence in DNA at which DNA synthesis is initiated
by an agent for polymerization. Gutierrez, et al., 1988, Nuc. Acids Res. 16:5895-5914, disclose that the
minimal phi29 replication origins utilized by phi29 DNA polymerase are located within the terminal 12 base
pairs at each end of phi29 DNA. The so-called "left origin of replication" comprises the sequence
5'-AAAGTAAGCCCC-3' and the "right origin of replication" comprises the sequence 5'-AAAGTAGGGTAC-3',
with replication proceeding in the 5' to 3' direction as shown. However, because nucleotide changes may be
made at one or more positions within these sequences without abolishing their function, all such sequence
variants are likewise included within the scope of the term "bacteriophage phi29 replication origin" as used
herein.






EP 0 593 095 A1 



Blanco, et al., 1984, Gene 29:33-40, and Blanco and Salas, 1984, Proc. Nat. Acad. Sci. 81:5325-5329,
together describe the purification of bacteriophage phi29 DNA polymerase from E. coli cells transformed
with a recombinant plasmid containing the phi29 DNA polymerase gene. Garcia, et al., 1983, Gene 21:65-76,
and Prieto, et al., 1984, Proc. Nat. Acad. Sci. 81:1639-1643, together describe the purification of
bacteriophage phi29 terminal protein from E. coli cells transformed with a recombinant plasmid containing
the phi29 terminal protein gene.

Blanco and Salas, 1985, Proc. Nat. Acad. Sci. 82:6404-6408, and Blanco, et al., 1988, in "EMBO
Workshop - Gene Organization and Expression in Bacteriophages," at p. 63, describe the replication of
bacteriophage phi29 DNA in the presence of purified phi29 terminal protein and phi29 DNA polymerase.
The replication of double-stranded bacteriophage phi29 DNA is initiated at the replication origins present at
both ends of the DNA by a protein priming mechanism. In the presence of phi29 terminal protein, phi29
DNA polymerase and the four deoxynucleoside triphosphates, a terminal protein-dAMP initiation complex is
formed that can be elongated to full-length phi29 DNA by phi29 DNA polymerase by a mechanism of strand
displacement.

In preferred embodiments of the invention an amplification method is employed for amplifying
heterologous DNA sequences on the order of 10,000 base pairs to 100,000 base pairs or more in length, as
well as heterologous DNA sequences less than 10,000 base pairs in length, which method utilizes a
bacteriophage phi29 replication origin and highly processive bacteriophage phi29 DNA polymerase. In
accordance with the invention described in the parent application it is now feasible to amplify any DNA
sequence by incorporating at one or both ends thereof a bacteriophage phi29 replication origin. Thus, the
invention described in the parent application provides new DNA constructs that do not naturally occur and
that can be replicated under the control of a bacteriophage phi29 replication origin.

The bacteriophage phi29 replication origin may be obtained from natural sources or may be produced
synthetically, and may be joined to the end of a heterologous DNA fragment in a ligation reaction with DNA
ligase or otherwise introduced adjacent to a DNA sequence to be amplified by methods known in the art,
such as site-directed mutagenesis, or through the use of an origin-primer according to the methods
generally described by Mullis, et al., supra.

Amplification of a heterologous DNA sequence under the control of a bacteriophage phi29 replication
origin typically will be performed in vitro in a buffered aqueous solution, in the presence of bacteriophage
phi29 terminal protein, bacteriophage phi29 DNA polymerase, and the four deoxynucleoside triphosphates,
and under suitable conditions of temperature, salt concentration (e.g., Mg2+, NH4+), and pH such that a
multiplicity of full-length copies of the heterologous DNA are synthesized.

Because replication from a bacteriophage phi29 replication origin proceeds unidirectionally, using as a
template one strand of a double-stranded DNA, the amplification of a double-stranded heterologous DNA is
accomplished by joining to both DNA ends a linker comprising a bacteriophage phi29 replication origin,
such that both strands of the DNA template are replicated. Accordingly, the amplification of a genomic DNA
restriction fragment is preferably accomplished by joining to both ends of the restriction fragment linkers
comprising a bacteriophage phi29 replication origin. A linker comprising a bacteriophage phi29 replication
origin is conveniently referred to as a "phi29 origin linker."

In one embodiment of the invention, the genomic DNA restriction fragment will comprise two restriction
enzyme-generated cohesive ends and the phi29 origin linkers preferably will each have a cohesive-end
sequence complementary to one of the ends of the genomic DNA restriction fragment. In another
embodiment of the invention the genomic DNA restriction fragment will comprise one of the original ends of
the genomic DNA substrate and one restriction enzyme-generated cohesive end. Typically the original end
of the genomic DNA substrate will be a blunt end, in which case amplification of the genomic DNA
restriction fragment preferably will be accomplished by joining to the original end a phi29 origin linker
having a blunt end, and joining to the restriction enzyme-generated cohesive end a phi29 origin linker
having a complementary cohesive end. The term "blunt end" as used herein refers to an end of a
double-stranded nucleic acid that lacks any single-stranded overhang. If the original end of the genomic DNA
substrate is not a blunt end, it is preferably converted to a blunt end, for example, by removal of any
single-stranded overhang with S1 nuclease, or another suitable single-stranded exonuclease, or by filling in the
original ends according to known methods. See, for example, Maniatis, et al., supra.

If in an amplification step two or more different genomic DNA restriction fragments are amplified in the
same reaction, it will be necessary to separate the different restriction fragments from one another in order
to produce individual genomic DNA clones. Separation of the different restriction fragments may be
accomplished subsequent to amplification by any of the known methods for physically separating nucleic
acids, such as chromatography, centrifugation, or electrophoresis. Lengthy DNA molecules, in the size
range from about 10,000 base pairs to 100,000 base pairs or more, are preferably separated by the
technique of pulsed field gel electrophoresis. Carle and Olson, 1984, Nuc. Acids Res. 12:5647-5664; Smith,
et al., 1986, Genetic Engineering 8:45-70 (Plenum Press, New York). The separated DNA molecules can
then be isolated in any known manner and, if desired, the isolated DNA molecules again may be amplified
to produce a multiplicity of copies of each individual genomic DNA clone.

Alternatively, the different DNA restriction fragments may be labeled with moieties capable of producing,
either directly or indirectly, different detectable signals, as for example by incorporating such moieties
in the different linkers that are joined with the DNA restriction fragments, and physically separating the DNA
restriction fragments on the basis of those different signals. Thus, for example, the detectable moiety can
be a fluorophore, and the DNA restriction fragments separated by flow cytometry. Gray, et al., 1987,
Science 238:323-329.

In yet a further aspect, the present invention provides efficient means for the physical mapping of
genomic DNA restriction fragments having multiple different cohesive-end sequences. After digesting
genomic DNA with a restriction enzyme capable of generating multiple different cohesive-end sequences,
the relative positions of the resulting restriction fragments within the substrate genomic DNA are established
by determining the sequence at one or both ends of each different restriction fragment, including at least
the sequence of each cohesive end. Provided the cohesive-end sequences differ between each different
restriction fragment, alignment of the restriction fragments is conveniently accomplished by taking advantage
of the fact that restriction fragments that are adjacent within the genomic DNA substrate, and therefore
share common cleavage sites, will have complementary cohesive-end sequences.

The nucleotide sequence at the end of a genomic DNA restriction fragment may be determined directly
by any of the known methods of nucleic acid sequencing, for example, the chemical degradation technique
of Maxam and Gilbert, 1980, Meth. Enzymol. 65:499-560, or the chain termination technique of Sanger, et
al., 1977, Proc. Nat. Acad. Sci. 74:5463-5467, or may be determined indirectly from the cohesive-end
sequence of a linker(s) that is joined to the restriction fragment.

The manipulation and analysis of accumulated sequence data, particularly the aligning of individual
genomic DNA restriction fragments in a physical map, is conveniently accomplished using a computer
method. For example, Staden, 1982, Nuc. Acids Res. 10:4731-4751, describes a computer method for
handling DNA sequence information, and aligning DNA restriction fragment clones that are related to one
another by overlap of their sequences.

The following examples are offered by way of illustration, and are not intended to limit the invention in
any manner. All patent and literature references described herein are expressly incorporated.

Examples 

Digestion of Adenovirus-2 DNA with Sfi I

Purified adenovirus-2 DNA (International Biotechnologies, Inc., New Haven, CT) is digested with
restriction endonuclease Sfi I (New England BioLabs, Beverly, MA) in a reaction consisting of 10mM
Tris-HCl (pH 7.9), 10mM MgCl2, 10mM 2-mercaptoethanol, 50mM NaCl, 100 ug/ml bovine serum albumin,
20 ug/ml adenovirus-2 DNA, and 5 units Sfi I in a total reaction volume of 100 ul, for 1 hour at 50 °C.

Removal of Adenovirus-2 Terminal Protein 

Adenovirus-2 DNA contains a virus-encoded protein covalently bound to the 5' terminus of each strand
of the linear DNA molecule, removal of which is necessary to produce ends capable of ligation with linkers.
Stillman, et al., 1981, Cell 23:497-508; Tamanoi, et al., 1982, Proc. Nat. Acad. Sci. 79:2221-2225. To
separate the 5' terminal protein from adenovirus-2 DNA, 20 ug/ml adenovirus-2 DNA in 10mM Tris-HCl (pH
7.5) is mixed with an equal volume of 1M piperidine, incubated for 2 hours at 37 °C, then lyophilized,
dissolved in water and lyophilized again.


Oligonucleotide Synthesis 

The single-stranded 12-mer oligonucleotide 5' GGGGCTTACTTT 3' and each of 64 different
single-stranded 15-mer oligonucleotides having the general sequence 5' AAAGTAAGCCCCNNN 3', wherein N is
any of the four nucleotides adenine, thymine, guanine, and cytosine, are synthesized on a Biosearch Model
8600 DNA Synthesizer in accordance with the method of Froehler et al., supra. To prepare linkers, each of
the different 15-mer oligonucleotides is annealed with the complementary 12-mer oligonucleotide in
hybridization buffer, consisting of 10mM Tris-HCl (pH 7.5), 10mM MgCl2, 20mM NaCl, at a temperature
between 25° and 30 °C. The nucleotide sequences of the 64 linker products, having the general structure

5' AAAGTAAGCCCCNNN 3'
3' TTTCATTCGGGG 5'

are shown in Fig. 1.
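
The enumeration of the linker library can be sketched in a few lines of Python. This is only an illustration of the counting described above, not part of the disclosed method; the names `ORIGIN_TOP`, `BOTTOM_12MER`, `make_linker_library` and `reverse_complement` are hypothetical.

```python
from itertools import product

# phi29 left origin as it appears in the 15-mers (5'->3'), and the 12-mer
# that is annealed to it (also written 5'->3'), both taken from the text.
ORIGIN_TOP = "AAAGTAAGCCCC"
BOTTOM_12MER = "GGGGCTTACTTT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def make_linker_library():
    """Return the top-strand 15-mers, one for each possible 3-nt cohesive end."""
    return [ORIGIN_TOP + "".join(nnn) for nnn in product("ATGC", repeat=3)]


linkers = make_linker_library()
print(len(linkers))                                   # number of distinct linkers
print(reverse_complement(ORIGIN_TOP) == BOTTOM_12MER)  # the 12-mer pairs with the origin
```

Because the 12-mer covers only the origin portion, the last three nucleotides of each 15-mer remain single-stranded and form the 3' cohesive end.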

Cloning of the Adenovirus-2 Genome

Adenovirus-2 terminal protein is removed from the ends of adenovirus-2 DNA as described above. Linkers
comprising two bacteriophage phi29 left origins of replication in a tail-to-tail orientation, and having the
sequence

5' AAAGTAAGCCCCGGGGCTTACTTT 3'
3' TTTCATTCGGGGCCCCGAATGAAA 5'

are then joined to the resulting blunt ends of the adenovirus-2 DNA, using 100 pmoles of linker per
microgram of adenovirus-2 DNA in ligation buffer consisting of 50mM Tris-HCl (pH 7.8), 10mM MgCl2, 2mM
dithiothreitol, 1mM ATP, 1mM spermidine, and 50 ug/ml bovine serum albumin. Two units of T4 DNA ligase
(New England BioLabs, Beverly, MA) are then added per microgram of adenovirus-2 DNA in the reaction
mixture, and the reaction mixture is then incubated at 15 °C for one hour.

Following ligation of the bacteriophage phi29 origin linkers to the original ends of the adenovirus-2 DNA,
the adenovirus-2 DNA is digested with Sfi I as described above. The cloning of the resulting Sfi I restriction
fragments is then conveniently carried out according to the methods of the invention in the individual wells
of 96-well microtiter plates.

To each separate well of the microtiter plates is added 100 pmoles each of two different linkers of Fig.
1, such that every possible combination of two different linkers of Fig. 1 is contained in one or another
microtiter well. The number of different combinations of the 64 linkers of Fig. 1, and thus the number of
microtiter wells that need be prepared for the cloning of Sfi I restriction fragments, is 2016. In general, the
number of different combinations of N different linkers may be determined according to the formula:

number of combinations = N(N - 1) / 2
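
The figure of 2016 wells is the number of unordered pairs that can be drawn from 64 linkers. The following Python sketch checks that arithmetic and, as an added illustration, works out how many 96-well plates the pairs would fill; the function name `pair_count` and the plate calculation are assumptions made for this example only.

```python
from itertools import combinations
from math import comb


def pair_count(n):
    """Number of unordered pairs of two different linkers out of n: n(n - 1)/2."""
    return n * (n - 1) // 2


N_LINKERS = 64
WELLS_PER_PLATE = 96

# Enumerate every pair of linker indices explicitly, one pair per well.
pairs = list(combinations(range(N_LINKERS), 2))

# Ceiling division gives the number of 96-well plates needed.
plates = -(-len(pairs) // WELLS_PER_PLATE)

print(pair_count(N_LINKERS), len(pairs), comb(N_LINKERS, 2), plates)
```

The closed-form count, the explicit enumeration and the binomial coefficient should all agree, which is a quick way to confirm the formula.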

1 ug of Sfi I-digested adenovirus-2 DNA is then added to each separate microtiter well in 25 ul of
ligation buffer together with 2 units of T4 DNA ligase (New England BioLabs, Beverly, MA). The reaction
mixture is incubated at 15 °C for one hour to allow the ligation of linkers to the cohesive ends of the
adenovirus-2 DNA restriction fragments. The ligation reaction is stopped by heating for 15 minutes at 68 °C.

Following ligation of linkers to the adenovirus-2 DNA restriction fragments, amplification of the
adenovirus-2 restriction fragments is accomplished by adding to the reaction mixture in each microtiter well
20 uM each dATP, dTTP, dGTP, and dCTP, 20mM (NH4)2SO4, 5% (vol/vol) glycerol, 300 ng of purified
bacteriophage phi29 terminal protein, and 20 ng of bacteriophage phi29 DNA polymerase. After incubation
for 2 hours at 30 °C, the DNA synthesis reaction is stopped by the addition of 10mM EDTA and heating for
10 minutes at 68 °C.

Following polyacrylamide gel electrophoresis of the DNA synthesis reaction mixtures, each of the
amplified adenovirus-2 DNA restriction fragments is visualized by staining the gel with ethidium bromide,
and then purified from the gel by electroelution. Sequencing of the ends of each strand of the restriction
fragments is carried out according to the method of Maxam and Gilbert, supra, using
α-32P-cordycepin-5'-triphosphate for 3'-end labeling. Tu, et al., 1980, Gene 10:177-183.

Physical Mapping of the Adenovirus-2 Genome 


Digestion of adenovirus-2 DNA with Sfi I generates four restriction fragments, ranging in size from
approximately 1000 base pairs to approximately 16,000 base pairs. The nucleotide sequence at the 3' ends
of each of the restriction fragments is as follows:
Fragment    Sequence

1           5' . . . TGC 3'
            3' GTA . . . 5'

2           5' . . . GAC 3'
            3' GAC . . . 5'

3           5' . . . ATG 3'
            3' CTG . . . 5'

4           5' . . . CTG 3'
            3' ACG . . . 5'


Alignment of the restriction fragments in a physical map is accomplished by analyzing the nucleotide
sequence of each restriction fragment for complementary 3'-end sequences. Accordingly, the four Sfi I
restriction fragments of adenovirus-2 DNA are aligned as follows:



5' . . . TGC 3'    5' . . . CTG 3'    5' . . . GAC 3'    5' . . . ATG 3'
3' GTA . . . 5'    3' ACG . . . 5'    3' GAC . . . 5'    3' CTG . . . 5'
  (fragment 1)       (fragment 4)       (fragment 2)       (fragment 3)
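
The alignment rule in this example (the 3' end of one fragment's top strand must base-pair with the 3' end of its neighbour's bottom strand) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the computer method cited earlier: the data layout and the names `FRAGMENTS`, `pairs_with` and `order_fragments` are hypothetical, and the end sequences are transcribed from the four-fragment example.

```python
COMP = {"A": "T", "T": "A", "G": "C", "C": "G"}

# fragment number -> (left_end, right_end)
#   left_end : 3' end of the bottom strand, written 3'->5' as in the example
#   right_end: 3' end of the top strand, written 5'->3'
FRAGMENTS = {
    1: ("GTA", "TGC"),
    2: ("GAC", "GAC"),
    3: ("CTG", "ATG"),
    4: ("ACG", "CTG"),
}


def pairs_with(right_end, left_end):
    """True if a top-strand 3' end (5'->3') base-pairs position by position
    with a bottom-strand 3' end written 3'->5'."""
    return len(right_end) == len(left_end) and all(
        COMP[a] == b for a, b in zip(right_end, left_end)
    )


def order_fragments(frags):
    """Chain fragments left to right by matching complementary cohesive ends."""
    successor = {}
    for name, (_, right) in frags.items():
        matches = [other for other, (left, _) in frags.items()
                   if other != name and pairs_with(right, left)]
        if len(matches) > 1:
            raise ValueError(f"ambiguous right neighbour for fragment {name}")
        if matches:
            successor[name] = matches[0]
    # The leftmost fragment is the one that is nobody's right neighbour.
    starts = [name for name in frags if name not in successor.values()]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique leftmost fragment")
    order = [starts[0]]
    while order[-1] in successor and len(order) < len(frags):
        order.append(successor[order[-1]])
    return order


print(order_fragments(FRAGMENTS))
```

The sketch relies on the condition stated in the description that the cohesive-end sequences differ between fragments; where two junctions share the same cohesive end, the function raises an error instead of guessing.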



Cloning of an Internal Sfi I Fragment of Adenovirus-2 DNA

This example illustrates an embodiment of the invention wherein a genomic DNA restriction fragment 
having a predetermined cohesive-end sequence is cloned. 

Adenovirus-2 DNA is digested with Sfi I as described above. From the known nucleotide sequence of
the adenovirus-2 genome, Roberts, et al., 1986, in "Adenovirus DNA," W. Doerfler (ed.) (Martinus Nijhoff
Publishing, Boston, MA), it may be predicted that one of the resulting restriction fragments, extending from
nucleotide 17305 to nucleotide 23046 of the adenovirus-2 genome, will have the following structure:

5' CGGC . . . GCCGGAC 3'
3' GACGCCG . . . CGGC 5'

As the first step in the in vitro cloning of this restriction fragment, 1 ug of Sfi I-digested adenovirus-2
DNA is mixed with 100 pmoles of oligonucleotide linker having the sequence

5' AAAGTAAGCCCCCTG 3'
3' TTTCATTCGGGG 5'

and 100 pmoles of oligonucleotide linker having the sequence

5' AAAGTAAGCCCCGTC 3'
3' TTTCATTCGGGG 5'

in 25 ul of ligation buffer. 2 units of T4 DNA ligase are then added to the mixture and the ligation reaction
allowed to proceed for one hour at 15 °C.

Following ligation of the linkers to the complementary cohesive ends of the desired adenovirus-2 DNA
restriction fragment, amplification of the restriction fragment is accomplished by adding to the ligation
reaction mixture 20 uM each dATP, dTTP, dGTP, and dCTP, 20mM (NH4)2SO4, 5% (vol/vol) glycerol, 300
ng of purified bacteriophage phi29 terminal protein, and 20 ng of bacteriophage phi29 DNA polymerase.
After incubation for 20 minutes at 30 °C, the DNA synthesis reaction is stopped by the addition of 10mM
EDTA and heating for 10 minutes at 68 °C.
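
The choice of the two linkers in this example follows from the predicted cohesive ends of the fragment: each linker's 3' overhang is the reverse complement of the fragment overhang it is meant to join. A short Python sketch of that selection is given below; the function names are hypothetical and the code is an illustration only.

```python
ORIGIN_TOP = "AAAGTAAGCCCC"  # phi29 left origin portion of each linker, 5'->3'


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def linker_for_overhang(overhang_5to3):
    """Top-strand 15-mer whose 3' overhang pairs with the given fragment overhang."""
    return ORIGIN_TOP + reverse_complement(overhang_5to3)


# Predicted 3' overhangs of the 17305-23046 fragment, both written 5'->3'.
top_overhang = "GAC"           # from 5' ... GCCGGAC 3'
bottom_overhang = "GAC"[::-1]  # bottom strand is shown 3' GAC ... 5', i.e. 5' CAG 3'

print(linker_for_overhang(top_overhang))
print(linker_for_overhang(bottom_overhang))
```

The two sequences printed correspond to the GTC and CTG linkers used in this example.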

Claims 

1. A method for cloning genomic DNA which comprises:

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end
sequences;

(b) contacting the digested genomic DNA with a linker, said linker comprising a reporter nucleotide
sequence and a cohesive-end sequence complementary to a cohesive-end sequence generated by
the restriction enzyme of step (a), under conditions such that a linker is joined to a restriction
fragment of genomic DNA; and

(c) amplifying the product of step (b).

2. The method of claim 1 wherein the digested genomic DNA is contacted with two or more different 
linkers in step (b). 

3. The method of claim 1 wherein the cohesive-end sequences generated by the restriction enzyme of
step (a) are of the same length.

4. The method of claim 1 wherein the restriction enzyme of step (a) is Sfi I.

5. The method of claim 1 or claim 2 wherein steps (b) and (c) are repeated at least once. 

6. The method of claim 1 or claim 2 wherein the product of step (b) is separated from the digested 
genomic DNA before or after step (c). 

7. The method of claim 6 wherein the product of step (b) is separated by electrophoresis. 

8. A method for mapping genomic DNA restriction fragments which comprises:

(a) digesting genomic DNA with a restriction enzyme that generates multiple different cohesive-end
sequences;

(b) amplifying the restriction fragment products of step (a);

(c) determining the cohesive-end sequence of each of said restriction fragments; and in accordance
therewith,

(d) aligning the restriction fragments in a physical map.

9. The method of claim 8 wherein the method further comprises separating the restriction fragment 
products of step (a) from one another before or after step (b). 

10. The method of claim 8 wherein the method further comprises separating the restriction fragment 
products of step (a) from one another by gel electrophoresis before or after step (b). 

11. The method of claim 8 wherein the genomic DNA of step (a) is eukaryotic. 

12. The method of claim 8 wherein the genomic DNA of step (a) is human. 

13. The method of claim 8 wherein the restriction enzyme of step (a) is Sfi I.

14. A method according to claim 8 wherein the cohesive-end sequence of each of said restriction
fragments is determined indirectly from the cohesive-end sequence of a linker joined to said each
restriction fragment.

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HGURE la 



5' AAAGTAAGCCCCAAA 3 
3' TTTCATTCGGGG 5 

5' AAAGTAAGCCCCAAG 3 
3' TTTCATTCGGGG 5 

5' AAAGTAAGCCCCATA 3 
3 ' TTTCATTCGGGG 5 

5* AAAGTAAC3CCCCATG 3 
31 TTTCATTCGGGG 5 

5* AAAGTAAGCCCCAGA 3 
3 • TTTCATTCGGGG 5 

5» AAAGTAAGCCCCAGG 3 
31 TTTCATTCGGGG 5 

5« AAAGTAAGCCCCACA 3 
3. TTTCATTCGGGG 5 

5' AAAGTAAGCCCCAGG 3 
3» TTTCATTCGGGG 5 

5' AAAGTAAGCCCCTAA 3 
3' TTTCATTCGGGG 5 

5* AAAGTAAGCCCCTAG 3 
31 TTTCATTCGGGG 5 

5» AAAGTAAGCCCCTTA 3 
3» TTTCATTCGGGG 5 

5' AAAGTAAGCCCCTTG O 
31 TTTCATTCGGGG 5 

5* AAAGTAAGCCCCTGA 3 
3» TTTCATTCGGGG 5 

5' AAAGTAAGCCCCTGG 3 
31 TTTCATTCGGGG 5 

5' AAAGTAAGCCCCTGA 3 
3, TTTCATTCGGGG 5 

5' AAAGTAAGCCCCTGG 3 
3' TTTCATTCGGGG 5 



5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 

5 
3 



AAAGTAAGCCCCAAT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCAAG 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCATT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCATC 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCAGT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCAGG 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCAGT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCAGG 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTAT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTAG 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTTT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTTC 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTGT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTGG 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTCT 3 
TTTCATTCGGGG 5 

AAAGTAAGCCCCTCC 3 
TTTCATTCGGGG 5 
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FIGURE lb 



5« 


AAAGTAAGCCCCGAA 


3 • 


5* 


AAAGTAAGCCCCGAT 


3« 


3' 


TTTCATTCGGGG 


5' 


3 • 


TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCGAG 


3' 


5' 


AAAGTAAGCCCCGAG 


3 » 


3' 


TTTCATTCGGGG 


5' 


3 • 


TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCGTA 


3* 


5' 


AAAGTAAGCCCCGTT 


3* 


3 • 


TTTCATTCGGGG 


5* 


3* 


TTTCATTCGGGG 


5* 


5' 


AAAGTAAGCCCCGTG 


3' 


5' 


AAAGTAAGCCCCGTG 


3' 


3 • 


TTTCATTCGGGG 


5' 


3 • 


TTTCATTCGGGG 


5* 


5' 


AAAGTAAGCCCCGGA 


3« 


5' 


AAAGTAAGCCCCGGT 


3' 


3« 


TTTCATTCGGGG 


5' 


3 • 


TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCGGG 


3» 


5' 


T^GTAAGCCCCGGC 


3' 


3' 


TTTCATTCGGGG 


5' 


3 ' 


TTTCATTCGGGG 


5 ' 


5' 


AAAGTAAGCCCCGGA 


3» 


5* 


AAAGTAAGCCCCGGT 


3' 


3 • 


TTTCATTCGGGG 


5' 


3 • 


TTTCATTCGGGG 


5* 


5' 


AAAGTAAGCCCCGGG 


3 • 


5' 


AAAGTAAGCCCCGCC 


3 


3 • 


TTTCATTCGGGG 


5 • 


3 ' 


TTTCATTCGGGG 


5 ' 


5' 


AAAGTAAGCCCCGAA 


3* 


5* 


AAAGTAAGCCCCGAT 


3' 


3 • 


TTTCATTCGGGG 


5 * 


3 • 


TTTCATTCGGGG 


5 • 


5' 


AAAGTAAGCCCCGAG 


3« 


5' 


AAAGTAAGCCCCGAG 


3 


3 ' 


TTTCATTCGGGG 


5 • 


3 • 


TTT C ATTCGGGG 


5 ' 


5» 


AAAGTAAGCCCCGTA 


3' 


5* 


AAAGTAAGCCCCGTT 


3 


3" 


TTTCATTCGGGG 


5' 


3' 


TTTCATTCGGGG 


5' 


5» 


AAAGTAAGCCCCCTG 


3' 


5* 


AAAGTAAGCCCCCTG 


3 


3' 


TTTCATTCGGGG 


5' 


3* 


TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCGGA 


2* 


5* 


AAAGTAAGCCCCGGT 


3 


3» 


TTTCATTCGGGG 


5» 


3« 


TTTCATTCGGGG 


5' 


5» 


Ai\AGTAAGCCCCCGG 


3« 


5' 


AAAGTAAGCCCCGGG 


3 


3* 


TTTCATTCGGGG 


5* 


3' 


TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCCCA 


3* 


5« 


AAAGTAAGCCCCCCT 


3 


3' 


TTTCATTCGGGG 


5« 


3» 


• TTTCATTCGGGG 


5' 


5' 


AAAGTAAGCCCCCCG 


3 • 


5' 


AAAGTAAGCCCCGCC 


3 


3 • 


TTTCATTCGGGG 


5' 


3' 


TTTCATTCGGGG 


5" 
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